An apparatus and method for formatting a
specified group of related web pages into a single
web page allows a user to define a number of
selected pages and associated relation criteria
for each selected page. A formatting mechanism
collects the URLs for the selected pages and
those related pages based on the relation criteria
and stores the URLs in a URL container. The
formatting mechanism further invokes each web
page associated to the URLs contained in the
URL container and generates a conglomerate
page. The conglomerate web page may include
data inserted into or referenced in one or more of
the selected pages. The conglomerate web page
may then be printed using a standard browser
print function.

FIELD OF THE INVENTION

This invention generally relates to computer networks, such as the Internet. More specifically, this invention
relates to an apparatus and method for formatting web pages.

BACKGROUND OF THE INVENTION 

The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer
era. Since that time, computer systems have evolved into extremely sophisticated devices, and computer
systems may be found in many different settings. The widespread proliferation of computers prompted the
development of computer networks that allow computers to communicate with each other. With the
introduction of the personal computer (PC), computing became accessible to large numbers of people.
Networks for personal computers were developed that allow individual users to communicate with each
other. In this manner, a large number of people within a company could communicate at the same time with
a software application running on one computer system.

One significant computer network that has recently become very popular is the Internet. The Internet grew
out of this proliferation of computers and networks, and has evolved into a sophisticated worldwide network
of computer system resources commonly known as the "world-wide-web", or WWW. A user at an individual
PC (i.e., workstation) that wishes to access the Internet typically does so using a software application
known as a web browser. A web browser makes a connection via the Internet to other computers known as
web servers, and receives information from the web servers that is displayed on the user's workstation.
Information transmitted from the web server to the web browser is generally formatted using a specialized
language called Hypertext Markup Language (HTML) and is typically organized into pages known as web
pages. Many web pages include one or more special reference locations known as "links" that invoke other
web pages. Links allow a web user to easily navigate to other web sites of interest by clicking on the
appropriate link with a mouse or other pointing device.

Often a web user will want to print a web page being currently viewed. Web browsers typically have a print
function that allows a user to print the current page. However, as the complexity of web sites increases it
becomes increasingly difficult to locate needed information, and the process of printing several related web
pages becomes a tedious exercise that involves: invoking the web page, printing the web page, invoking
the next web page, printing, invoking, printing, etc. In other words, prior art browsers require a user to
invoke a page before printing it. With these prior art browsers, if a user needs to print 40 related web pages
the user must manually invoke and print each of the 40 web pages. Needless to say, this process becomes
very time-consuming.

As the number of Internet users, providers, and web servers continues to rapidly expand, it will become
increasingly important for a web user to be able to print related web pages without manually invoking and
printing each page. Without improvements in the manner web pages are printed, the printing of web pages
will continue to be an impediment to the effective usage of resources available on the Internet.

SUMMARY OF THE INVENTION 



According to the present invention, an apparatus and method for formatting a specified group of related
web pages into a single web page is disclosed. A user defines a number of selected pages and associated
relation criteria for each selected page. A formatting mechanism collects the URLs for the selected pages
and those related pages based on the relation criteria and stores the URLs in a URL container. The
formatting mechanism further invokes each web page associated to the URLs contained in the URL
container and generates a conglomerate page. The conglomerate web page may include data inserted into or
referenced in one or more of the selected pages. The conglomerate web page may then be printed using a
standard browser print function.

The foregoing and other objects, features and advantages of the invention will be apparent from the
following more particular description of preferred embodiments of the invention, as illustrated in the
accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

The preferred exemplary embodiments of the present invention will hereinafter be described in conjunction
with the appended drawings, where like designations denote like elements, and:

FIG. 1 is a block diagram of a computer system in accordance with the present invention;

FIG. 2 is a block diagram of a typical Internet connection;

FIG. 3 is a flow diagram of the method steps for formatting selected and related web pages in accordance
with the preferred embodiment;

FIG. 4 is a block diagram of the nesting structure used as the relation criteria in accordance with the
preferred embodiment;

FIG. 5 is a block diagram of a computer system that allows formatting of selected and related web pages in
accordance with the preferred embodiment;

FIG. 6 is a flow diagram of the method steps for collecting and formatting the selected and related web
pages in accordance with the preferred embodiment;

FIG. 7 is a pseudo-code representation of the recursive method of collecting the selected and related web
pages in accordance with the preferred embodiment;

FIG. 8 is a pseudo-code representation of the processing method of the URL container in accordance with
the preferred embodiment; and

FIG. 9 is a pseudo-code representation of the flattening process in accordance with the preferred
embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 
Overview 

The method and apparatus of the present invention has particular applicability to formatting web pages on
the Internet. For those individuals who are not familiar with the Internet, a brief overview of relevant Internet
concepts is presented here.

An example of a typical Internet connection is shown in FIG. 2. A user that wishes to access information on
the Internet 170 typically has a computer workstation 200 that executes an application program known as a
web browser 210. Under the control of web browser 210, workstation 200 sends a request for a web page
over the Internet 170. Web page data can be in the form of text, graphics and other forms of information,
collectively known as MIME data. Each web server on the Internet has a known address, termed the
Uniform Resource Locator (URL), which the web browser uses to connect to the appropriate web server.
Because web server 220 can contain more than one web page, the user will also specify in the address
which particular web page he wants to view on web server 220. A web server computer system 220
executes a web server application 222, monitors requests, and services requests for which it has
responsibility. When a request specifies web server 220, web server application 222 generally accesses a
web page corresponding to the specific request, and transmits the page to the user's workstation 200.
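
As a small illustration of how one address names both the web server and the particular page on it, the following sketch splits a URL using Python's standard urllib.parse module. The function name and the example address are assumptions made for illustration only.

```python
from urllib.parse import urlsplit


def server_and_page(url: str) -> tuple[str, str]:
    """Split a URL into the web server's host name and the page path."""
    parts = urlsplit(url)
    # An empty path means the server's default page.
    return parts.netloc, parts.path or "/"


# Hypothetical address, used only for illustration.
host, page = server_and_page("http://www.example.com/docs/index.html")
print(host, page)
```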

Web Pages 

A web page may contain various types of MIME data. Most web pages include visual data that is intended
to be displayed on the monitor of user workstation 200. Web pages are generally written in Hypertext
Markup Language (HTML). When web server 220 receives a web page request, it will send the requested
page in HTML form across the Internet 170 to the requesting web browser 210. Web browser 210
understands HTML and interprets it and outputs the web page to the monitor of user workstation 200. This
web page displayed on the user's screen may contain any suitable MIME data, including text, graphics, and
links (which reference addresses of other web pages). These other web pages (i.e., those represented by
links) may be on the same or on different web servers. The user can invoke these other web pages by
clicking on these links using a mouse or other pointing device. This entire system of web pages with links to
other web pages on other servers across the world is known as the "World Wide Web".

The remainder of this specification describes how the present invention improves the convenience of
formatting and printing related web pages by providing ways that a user may format and print related web
pages without the customary user interaction required to invoke and print each web page. Those skilled in
the art will appreciate that the present invention applies equally to the formatting and/or printing of any
related data, whether the data be in the form of web pages, database records, or other data that may be
interrelated.

DETAILED DESCRIPTION 

Referring to FIG. 1, a computer system 100 in accordance with the present invention includes a processor
110, a main memory 120, a mass storage interface 140, and a network interface 150, all connected by a
system bus 160. Those skilled in the art will appreciate that this system encompasses all types of computer
systems: personal computers, midrange computers, mainframes, etc. Note that many additions,
modifications, and deletions can be made to this computer system 100 within the scope of the invention.
Examples of possible additions include: a computer monitor, a keyboard, a cache memory, and peripheral
devices such as printers.

Processor 110 can be constructed from one or more microprocessors and/or integrated circuits. Processor
110 executes program instructions stored in main memory 120. Main memory 120 stores programs and
data that the computer may access. When computer system 100 starts up, processor 110 initially executes
the program instructions that make up operating system 126. Operating system 126 is a sophisticated
program that manages the resources of the computer system 100. Some of these resources are the
processor 110, main memory 120, mass storage interface 140, network interface 150, and system bus 160.

Main memory 120 includes one or more application programs 122, data 124, operating system 126, a web
page formatting mechanism 128, and one or more web pages 130. Application programs 122 are executed
by processor 110 under the control of operating system 126. Application programs 122 can be run with
program data 124 as input. Application programs 122 can also output their results as program data 124 in
main memory. In the present invention, a computer system 100 includes a web page formatting mechanism
128 that allows multiple related web pages to be formatted into a single page, which may then be printed,
downloaded to disk, placed on the Internet or put to any other use known by one skilled in the art.

Mass storage interface 140 allows computer system 100 to retrieve and store data from auxiliary storage
devices such as magnetic disks (hard disks, diskettes) and optical disks (CD-ROM). These mass storage
devices are commonly known as Direct Access Storage Devices (DASD), and act as a permanent store of
information. One suitable type of DASD is a floppy disk drive 180 that reads data from and writes data to a
floppy diskette 186. The information from the DASD can be in many forms. Common forms are application
programs and program data. Data retrieved through mass storage interface 140 is usually placed in main
memory 120 where processor 110 can process it.

While main memory 120 and DASD device 180 are typically separate storage devices, computer system
100 uses well known virtual addressing mechanisms that allow the programs of computer system 100 to
behave as if they only have access to a large, single storage entity, instead of access to multiple, smaller
storage entities (e.g., main memory 120 and DASD device 180). Therefore, while certain elements are
shown to reside in main memory 120, those skilled in the art will recognize that these are not necessarily all
completely contained in main memory 120 at the same time. It should be noted that the term "memory" is
used herein to generically refer to the entire virtual memory of computer system 100. In addition, an
apparatus in accordance with the present invention includes any possible configuration of hardware and
software that contains the elements of the invention, whether the apparatus is a single computer system or
is comprised of multiple computer systems operating in concert.

Network interface 150 allows computer system 100 to send and receive data to and from any network the
computer system may be connected to. This network may be a local area network (LAN), a wide area
network (WAN), or more specifically the Internet 170. Suitable methods of connecting to the Internet include
known analog and/or digital techniques, as well as networking mechanisms that are developed in the
future. Many different network protocols can be used to implement a network. These protocols are
specialized computer programs that allow computers to communicate across a network. TCP/IP
(Transmission Control Protocol/Internet Protocol), used to communicate across the Internet, is an example
of a suitable network protocol.

System bus 160 allows data to be transferred among the various components of computer system 100.
Although computer system 100 is shown to contain only a single main processor and a single system bus,
those skilled in the art will appreciate that the present invention may be practiced using a computer system
that has multiple processors and/or multiple buses. In addition, the interfaces that are used in the preferred
embodiment may include separate, fully programmed microprocessors that are used to off-load
compute-intensive processing from processor 110, or may include I/O adapters to perform similar functions.

At this point, it is important to note that while the present invention has been (and will continue to be)
described in the context of a fully functional computer system, those skilled in the art will appreciate that the
present invention is capable of being distributed as a program product in a variety of forms, and that the
present invention applies equally regardless of the particular type of signal bearing media used to actually
carry out the distribution. Examples of suitable signal bearing media include: recordable type media such as
floppy disks (e.g., 186 of FIG. 1) and CD ROM, and transmission type media such as digital and analog
communications links.

The remainder of this specification will describe the preferred embodiments for the web page formatting
mechanism 128 which takes a number of selected web pages, collects the URLs, and creates a single
document which may be termed a conglomerate or flattened web page. The term "flattened web page" is
used herein to convey with imagery that several related pages in a typical cross-linked tree-like hierarchy
are all assembled or "flattened" into a single page, thereby removing cross-links and placing the various
pages in sequential order.

Referring now to FIG. 3, method 300 for formatting a web page starts by the user defining a list of URLs
and a relation criteria for each selected URL (step 310). This may be achieved in many different ways:
mechanically through inputs from a human user into a menu screen, by retrieving a list of URLs from a web
browser's historic memory, or any other method known by one skilled in the art to specify URL data. Once
the list of URLs and the relation criteria for each URL are defined, a URL container is created with all the
selected and related URLs (step 320). Finally, a flat page is generated from the selected and related URLs
within the flat container (step 330).

The relation criteria is an important element in the formatting process because it defines the requisite
association that must exist between a number of URLs to be deemed "related" URLs and therefore defines
which pages to include in the flattened page. The criteria for whether or not two URLs are "related" may
vary within the scope of the invention. One specific relation criteria is referred to herein as "nesting levels",
and is explained with reference to FIG. 4.

Pages with links to each other may be arranged in a tree-like structure 400 as shown in FIG. 4. Nesting
structure 400 has at least one selected web page 411 (e.g., first selected web page 411 and/or second
selected web page 450) with a number of links 421-439 (i.e., Link 1, Link 2, Link 3) to other pages 441-448.
The links each comprise a mechanism to invoke a web page, such as a URL, which may be activated by
the user. When a user defines the nesting level, they are determining the depth into the nesting tree 400
which the formatting mechanism reaches to find related URLs. For example, if a user chooses first selected
web page 411 as their selected page and defines a relation criteria of two nesting levels to collect the
related URLs, the related URLs comprise the URLs for the first selected web page 411 and the URLs for all
the links contained in those web pages that are directly linked to the first selected web page, namely: the
URLs for first selected web page 411, Link 1 web page 441, Link 2 web page 442, Link 3 web page 443,
Link A web page 444, Link B web page 445, Link D web page 446, Link F web page 447 and Link G web
page 448. If the nesting level was set to three then the related URLs would include those defined for the
two nesting level case and would additionally include the URLs for Link I 429, Link II 430, Link III 431, Link
IV 432, Link V 433, Link VI 434, Link VII 435, Link VIII 436, Link IX 437, Link X 438 and Link XI 439.

Other suitable relation criteria for relating URLs include: whether or not the URLs are on the same web
server; whether a specific search word appears in the web URL's search list; whether there is a link
between the URLs; or whether the URLs have the same base address.

An example of base address relation criteria follows. A home page may have the address
www.corporationX.com/home.html, and any URLs that have the base address www.corporationX.com are
related to the home page. In another example, a URL at address www.corporationX.com/support/index.html
is selected, and any URLs that share the base address www.corporationX.com/support are related to the
selected URL, while other URLs at this site are not related. Regardless of the specific relation criteria used,
URLs that are related are formatted into a single web page, as discussed in more detail below.
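
The base address test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the base address is taken to be the host name plus the directory part of the path, which matches the two corporationX examples but is not spelled out further in the text.

```python
from urllib.parse import urlsplit


def base_address(url: str) -> str:
    """Return host plus directory path, e.g. 'www.corporationX.com/support'."""
    # Addresses written without a scheme are treated as scheme-relative.
    parts = urlsplit(url if "//" in url else "//" + url)
    directory = parts.path.rsplit("/", 1)[0]
    return parts.netloc + directory


def is_related(selected: str, candidate: str) -> bool:
    """True when the candidate lies under the selected URL's base address."""
    base = base_address(selected)
    cand = base_address(candidate)
    return cand == base or cand.startswith(base + "/")


HOME = "www.corporationX.com/home.html"
SUPPORT = "www.corporationX.com/support/index.html"
print(is_related(HOME, "www.corporationX.com/support/faq.html"))
print(is_related(SUPPORT, "www.corporationX.com/products/list.html"))
```

Under this reading, every page on the site is related to the home page, while only pages under /support are related to the support index.
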

A better understanding of the above described steps may be obtained through the following example of the
preferred embodiment. Referring now to FIGS. 5-6, apparatus 500 in accordance with the preferred
embodiment includes a web client 200 and a web server 220 connected via the Internet.

Web client 200 includes a web browser application 210 and a web page formatting
mechanism 128. The web browser application 210 is a standard web browser known in the art. Web page
formatting mechanism 128 includes a web page selection mechanism 540, web page storing mechanism
550, and web page conglomeration mechanism 560. While web page formatting mechanism 128 is shown
in FIG. 5 as being separate from web browser 210, in the best mode of the invention it is contemplated that
web page formatting mechanism 128 will be integrated into a web browser application, thereby providing a
browser with advanced formatting capabilities for related web pages, such as printing, downloading,
document transfers, etc. In the alternative, web page formatting mechanism 128 may be a separate
application on web client 200, or may be a plug-in or Java applet/application for web browser
application 210. The functions of web page formatting mechanism 128 are described herein without regard
to whether mechanism 128 resides within web browser application 210 or outside of the browser.



Web page selection mechanism 540 is used to create a list of user selected and related web pages. Web
page storing mechanism 550 stores the list of selected and related web pages in a URL format.
Conglomeration mechanism 560 takes the selected URLs and formats them into a flattened web page.
Because apparatus 500 flattens many linked web pages into a single conglomerate web page, the standard
print function supplied with any browser will print the conglomerate web page. The function of mechanisms
540-560 may best be understood with relation to the flow diagram of FIG. 6.

A method 600 for formatting multiple related web pages or URLs begins by selecting the web pages or
URLs which are to be explicitly included in the flattened web page (step 610). Once the web pages have
been selected, a digging level is specified for each selected web page (step 620). The digging level is a
specific example of a suitable relation criteria, that is equivalent to the nesting level discussed with
reference to FIG. 4. When the selected web pages and the dig levels have been defined, a URL list of the
[...] level is created (step 630). This list is then in a format which may be
[...] is processed beginning with step 640. If the URL list is not empty (step
640=NO), the next URL in the list is retrieved (step 650). This selected URL is then added to the URL
container in preparation for processing (step 660). All URLs related to the selected URL are then [...]
and collecting the web pages (step 680). The mechanisms and interplay between web client and web
server [...] well known in the art. Once the web pages have been invoked, a flattened
web page is generated from the invoked web pages (step 690), preferably by appending the related web
pages together in a single web page. The flattened (i.e., conglomerate) web page is in a form which
[...] or used [...].

For a better understanding of the details of some of the steps of FIG. 6 we now refer to FIGS. 7-9. Method
700 of FIG. 7 represents one specific example of a suitable recursive collection process used to place
URLs in the URL container (step 670). Method 800 of FIG. 8 represents one specific example of
[...] the URL container (step 680). Method 900 of FIG. 9 represents one specific
example of the generation of a flattened web page (step 690).

Referring now to FIG. 7, method 700 is represented in recursive pseudo-code, and recursively stores in a
URL container the selected URLs and the URLs related by a suitable relation criteria (e.g., [...]
nesting levels). When the Collect URLs method 700 is invoked, the URL List, URL Container, and dig level
must be specified. The URL list is the list of URLs that the [...]
user-defined dig level for the selected URL is greater than zero (step 722), [...] for each URL [...]
recursively calls into the nesting tree structure (step 724) to collect the other "related"
URLs and place them in the URL container according to the relation criteria (i.e., digging level). This
process is continued until all selected and all related URLs are collected and placed in the URL container.

For example, if the selected URL has three links to other pages, and the user-defined dig level is three:
first, the selected URL is added to the URL container as the first dig level; second, the selected URL is
invoked to produce the URLs for the three linked web pages which are collected into the URL container as
the second dig level; and third, the three linked pages are separately invoked to produce the URLs for their
linked web pages to define the third dig level. These URLs are again collected and placed into the URL
container to complete the processing of the URLs in the URL list.
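
The recursion in the example above can be sketched in Python as follows. This is not the pseudo-code of FIG. 7, which is not reproduced in the text; it is a minimal sketch that assumes dig level one means the selected URL alone, as in the worked example. The names collect_urls and get_links, and the link table, are hypothetical, and get_links stands in for invoking a page and reading its links.

```python
from typing import Callable


def collect_urls(url_list: list[tuple[str, int]],
                 container: list[str],
                 get_links: Callable[[str], list[str]]) -> None:
    """Recursively place each selected URL and its related URLs in the container."""
    for url, dig_level in url_list:
        if dig_level < 1:
            continue
        if url not in container:
            container.append(url)
        if dig_level > 1:
            # "Invoke" the page to learn its links, then dig one level deeper.
            linked = [(target, dig_level - 1) for target in get_links(url)]
            collect_urls(linked, container, get_links)


# Hypothetical link table standing in for real web pages.
LINKS = {"sel": ["p1", "p2", "p3"], "p1": ["p1a"], "p2": ["p2a", "sel"], "p3": []}


def links_of(url: str) -> list[str]:
    return LINKS.get(url, [])


container: list[str] = []
collect_urls([("sel", 3)], container, links_of)
print(container)
```

Because the dig level shrinks by one on every call, the recursion terminates even when pages link back to each other.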

Referring now to FIG. 8, method 800 is represented in pseudo-code that prepares the URLs for flattening
into a conglomerate page. For each URL in the URL container (step 810) the web page corresponding to
each URL is invoked (step 820). All URLs referenced within the invoked web pages are collected in
preparation for processing (step 830). Next, each URL collected in step 830 is processed (step 840). If the
URL references a URL within the URL container (step 842), the first occurrence of the URL is marked as
the target using the NAME attribute (step 846), while all subsequent occurrences of the URL are
cross-referenced back to the target reference by modification of the HREF tags to include the text "See
Section ___" where "___" is the text specified in the target HREF (step 848). For example, for an anchor of
the form <A NAME="X">Chapter X</A> where Chapter X is included in the URL container, the description
inserted into the text would be "See Chapter X". If the URL references a URL not within the URL container,
the associated HREF attribute is modified to include the text "Section Not Included" (step 850). This
process is continued until all URLs in the URL container have been identified as a target, a cross-reference
or identified as being "Not Included."
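
The three outcomes described for each referenced URL (target, cross-reference, or not included) can be illustrated with a short Python sketch. It is an assumption-laden simplification of method 800: the function name, the titles table, and the sample data are hypothetical, and the references are given as a flat list instead of being read from invoked pages.

```python
def classify_references(references: list[str],
                        container: set[str],
                        titles: dict[str, str]) -> list[tuple[str, str]]:
    """Label each referenced URL as a target, a cross-reference, or not included."""
    seen: set[str] = set()
    result = []
    for url in references:
        if url not in container:
            result.append((url, "Section Not Included"))
        elif url not in seen:
            # The first occurrence becomes the target of later cross-references.
            seen.add(url)
            result.append((url, "target"))
        else:
            result.append((url, "See " + titles[url]))
    return result


# Hypothetical data for illustration.
REFS = ["a.html", "b.html", "a.html", "z.html"]
CONTAINER = {"a.html", "b.html"}
TITLES = {"a.html": "Chapter A", "b.html": "Chapter B"}
for url, label in classify_references(REFS, CONTAINER, TITLES):
    print(url, "->", label)
```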

Referring now to FIG. 9, method 900 is represented in pseudo-code that processes the modified URL
container at the conclusion of method 800 (FIG. 8) to generate the flattened or conglomerate web page. For
each URL in the URL container (step 910), each URL is invoked to produce the associated web page (step
920). Next, the URLs referenced in the page are collected (step 930). For each URL collected (step 940), if
the HREF statement includes an EMBED attribute (step 942), portions of the page specified in the HREF
statement that includes the EMBED attribute are inserted into the page (step 944). The new page is then
added to the flattened page file, by preferably appending the page to the end of the flattened page file (step
950). The EMBED attribute and other new attributes defined by the present invention are discussed in more
detail below.
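
The appending part of this process (steps 910, 920 and 950) can be sketched as below. The sketch deliberately leaves out the EMBED handling of steps 930 to 944; the function name flatten, the fetch callback, and the page table are hypothetical stand-ins for invoking real pages.

```python
from typing import Callable


def flatten(container: list[str], fetch: Callable[[str], str]) -> str:
    """Append every page named in the container into one flattened document."""
    flattened = []
    for url in container:
        page = fetch(url)  # stands in for invoking the web page
        flattened.append(page)
    return "\n".join(flattened)


# Hypothetical pages for illustration.
PAGES = {"a.html": "<P>Page A</P>", "b.html": "<P>Page B</P>"}
print(flatten(["a.html", "b.html"], PAGES.__getitem__))
```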

At a basic level, a web page is made up of various fields of information separated by special delimiters
known as "tags". Tags tell the web browser what to do with the information in a particular field. For
example, tags may cause the web browser to display an image, to display text, to play an audio message,
or to display text in a special field known as a hypertext link. A hypertext link is a referencing mechanism for
identifying remote resources located anywhere within the virtual memory space of the system, whether it be
on the same computer, a secondary storage device, or a remote computer over a network. In other words,
a link identifies the address (e.g., URL) which the computer should invoke when the link is selected by a
user, typically by clicking on the link with a mouse or other pointing device. A hypertext link is defined in
HTML using "anchor" tags. The tag that defines the beginning of the anchor is <A> and the tag that defines
the end of the anchor is </A>. Anchors may include attributes such as HREF and NAME. The HREF
attribute specifies the hypertext reference (e.g., URL) for the link. The NAME attribute places a marker in a
page that can be used by a link to specify a particular location or section in the page. Specifying a name
tells the browser where to begin displaying data. For example, NAME="X" marks a page with text X to
name a section of the HTML page. The present invention defines an additional attribute that defines the
end of a named field or section. The end attribute is similar, but reads NAME="X.end", thereby ending a
section. It is important to note that a browser will simply ignore any tags or attributes that it doesn't
recognize. This feature allows a web page designer to add special tags or attributes that a particular
browser may be able to recognize and process (such as NAME="X.end"), while assuring that the same
page will be displayed without problem on existing browsers.
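
To make the anchor attributes concrete, the sketch below scans a fragment of HTML for <A> tags and records their attributes using Python's standard html.parser module. The class name and the sample markup are assumptions for illustration; note that HTMLParser reports tag and attribute names in lower case.

```python
from html.parser import HTMLParser


class AnchorScanner(HTMLParser):
    """Collect the attributes of every <A> tag encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.anchors: list[dict] = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        if tag == "a":
            self.anchors.append(dict(attrs))


# Hypothetical markup using the NAME="X" and NAME="X.end" convention.
scanner = AnchorScanner()
scanner.feed('<A NAME="X"></A>Section text<A NAME="X.end"></A> '
             '<A HREF="other.html">See other</A>')
print(scanner.anchors)
```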

Another way of formatting HTML sections is to assume that the section runs from one NAME attribute to a
NAME attribute that, at some point in the page, defines a new section. For example, a table of contents
may include the following URLs:

myUrl#a
myUrl#b
myUrl#c
myUrl#d

The page myUrl may include the following sections:
<A NAME=a></A>(HTML data)
<A NAME=a0></A>(HTML data)
<A NAME=an></A>(HTML data)
<A NAME=b></A>(HTML data)
<A NAME=b0></A>(HTML data)
<A NAME=bn></A>(HTML data)
<A NAME=c></A>(HTML data)
<A NAME=d></A>(HTML data)
<A NAME=d0></A>(HTML data)
<A NAME=dn></A>(HTML data)

Assuming that an entire document will be printed by the various entries in a table of contents, if section b is
referenced, we can assume that we need all HTML from the <A NAME=b></A> tag to the <A
NAME=c></A> tag. Note that the end of the section may be defined by the very next tag encountered, or,
as in the example above, by a tag with a different label, causing sections b0 and bn to be included in the
reference to section b, while section c is identified as a different section. In this manner the X.end attribute
defined above is not needed if it is safe to make certain assumptions about the end points of HTML
sections.
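
The "different label" rule can be sketched as follows. The sketch assumes that a sub-label such as b0 or bn belongs to section b because it starts with the section's name; the text does not define the rule more precisely, so this prefix test, the function name, and the sample page are all assumptions.

```python
def extract_section(anchors: list[tuple[str, str]], name: str) -> str:
    """Join the HTML from anchor `name` up to the next anchor with a different label."""
    collected = []
    inside = False
    for label, html in anchors:
        if label == name:
            inside = True
        elif inside and not label.startswith(name):
            break  # a different label ends the section
        if inside:
            collected.append(html)
    return "".join(collected)


# The page myUrl from the example, as (NAME value, HTML data) pairs.
PAGE = [("a", "[a]"), ("a0", "[a0]"), ("an", "[an]"),
        ("b", "[b]"), ("b0", "[b0]"), ("bn", "[bn]"),
        ("c", "[c]"),
        ("d", "[d]"), ("d0", "[d0]"), ("dn", "[dn]")]
print(extract_section(PAGE, "b"))
```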

Another attribute defined by the present invention is an EMBED attribute. In the preferred embodiment,
another attribute EMBEDSRC is a specific example of a special-purpose EMBED attribute. For the purpose
of illustrating the concepts of the present invention, the EMBED attribute is used to embed ordinary text in
an HTML page marked off between a NAME="X" attribute and a NAME="X.end" attribute or between a
NAME="X" attribute and a following NAME attribute that defines the beginning of a following section
(thereby defining the end of the previous section). The EMBEDSRC attribute is used to embed text that
represents a portion of a source code listing. Having source code embedded with different visual
characteristics allows a programmer to display source code in a special format (e.g., color, font, font size,
etc.) in a page of documentation for a computer program. These attributes are placed within an HREF
statement. For example, the HREF statement to embed source code may read <A
HREF="mysource.java#method" EMBEDSRC>See method</A> while the HREF to embed an HTML
section reads <A HREF="mysource.html#section" EMBED>See method</A>. These new attributes identify
the embedding location where the linked information may be inserted during formatting of the web page,
described in method 900 and FIG. 9.
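
A minimal sketch of the embedding step is shown below. It assumes the referenced sections have already been extracted into a lookup table keyed by resource and section name, and it wraps EMBEDSRC content in <PRE> tags purely as one possible "special format"; the regular expression, the function name, and the table are illustrative assumptions, not the method of FIG. 9.

```python
import re

# Matches anchors of the form <A HREF="resource#section" EMBED>...</A>
# or the EMBEDSRC variant.
EMBED_LINK = re.compile(
    r'<A\s+HREF="([^"#]+)#([^"]+)"\s+(EMBEDSRC|EMBED)\s*>.*?</A>',
    re.IGNORECASE | re.DOTALL,
)


def expand_embeds(page: str, sections: dict) -> str:
    """Replace each EMBED/EMBEDSRC anchor with the section it references."""
    def replace(match):
        resource, section, kind = match.group(1), match.group(2), match.group(3)
        body = sections.get((resource, section))
        if body is None:
            return match.group(0)  # nothing to embed, keep the link as it is
        if kind.upper() == "EMBEDSRC":
            return "<PRE>" + body + "</PRE>"  # assumed source-code formatting
        return body
    return EMBED_LINK.sub(replace, page)


# Hypothetical extracted sections.
SECTIONS = {("mysource.java", "method"): "void method() {}",
            ("mysource.html", "section"): "<P>Body</P>"}
print(expand_embeds('Intro <A HREF="mysource.java#method" EMBEDSRC>See method</A> end',
                    SECTIONS))
```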

Other new anchor attributes defined by the present invention are FOLLOW, SHOULDFOLLOW and
NOFOLLOW. These are again placed in the HREF statement to indicate whether a link must be, should be
or must not be followed, respectively. The following is an example of one possible use of the FOLLOW
attribute:

<A HREF="mysource.html#section" FOLLOW>See method</A>. A URL with a FOLLOW attribute is
included if the referenced URL contains information (such as critical information) that should be included in
the conglomerate web page even if the nesting level would dictate otherwise. A NOFOLLOW attribute does
the opposite, not including the referenced URL even if the nesting level would have included it. A
SHOULDFOLLOW attribute is also provided that includes the referenced material into the conglomerate
web page if the relation criteria indicates to follow the links marked with the SHOULDFOLLOW attribute.
The SHOULDFOLLOW and NOFOLLOW attributes are used in an HREF statement in the same manner as
the FOLLOW attribute. It should be noted that additional tags or attributes may be created to do other
operations like compressing or expanding data, reformatting data to another source type or other
operations known by one skilled in the art.

How these attributes are processed depends on the relation criteria specified by the user. For example, the
user may specify to strictly adhere to the nesting level, ignoring any FOLLOW, SHOULDFOLLOW, or
NOFOLLOW attributes encountered. In the alternative, the user may specify a relation criteria that includes
all URLs that have the FOLLOW attribute, excludes those that have a NOFOLLOW attribute, and excludes
those that have a SHOULDFOLLOW attribute. In yet another alternative, URLs with a FOLLOW or
SHOULDFOLLOW attribute are included in the conglomerate web page while the URLs that have the
NOFOLLOW attribute are expressly excluded. The attributes provided by the present invention allow a user
to specify a more sophisticated relation criteria that may be more easily tailored to meet certain needs. Note
that all of the new attributes defined herein are collectively referred to as "embedding attributes."
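
The three alternatives described above may be sketched as a single decision function. This Python sketch is offered only as an illustration: the names Policy and should_include are hypothetical, and the handling of links that carry no attribute under the second and third alternatives (falling back to the nesting level) is an assumption, since the specification does not address that case.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical names for the three alternatives described in the text."""
    NESTING_ONLY = 1       # ignore FOLLOW / SHOULDFOLLOW / NOFOLLOW entirely
    FOLLOW_ONLY = 2        # include FOLLOW; exclude NOFOLLOW and SHOULDFOLLOW
    FOLLOW_AND_SHOULD = 3  # include FOLLOW and SHOULDFOLLOW; exclude NOFOLLOW


def should_include(attribute, depth, max_depth, policy):
    """Decide whether a linked URL is included in the conglomerate page.

    attribute is "FOLLOW", "SHOULDFOLLOW", "NOFOLLOW", or None for a plain link.
    """
    within_nesting = depth <= max_depth
    if policy is Policy.NESTING_ONLY or attribute is None:
        # Assumption: unmarked links always fall back to the nesting level.
        return within_nesting
    if attribute == "NOFOLLOW":
        return False
    if attribute == "FOLLOW":
        return True
    if attribute == "SHOULDFOLLOW":
        return policy is Policy.FOLLOW_AND_SHOULD
    return within_nesting
```

Under this sketch a FOLLOW link that lies beyond the nesting level is still included unless the user has chosen to adhere strictly to the nesting level.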

For method 900 of FIG. 9, each related web page is scanned to determine where the new attributes
discussed above are located, i.e., EMBED, EMBEDSRC, NAME="X.end", FOLLOW, SHOULDFOLLOW,
NOFOLLOW, etc. (steps 940 and 942). All the NAME tags that specify the end of a named section are
located to identify sections of data. Once the sections of data have been specified, any EMBED and
EMBEDSRC tags are located, and the corresponding associated sections are inserted into the
corresponding related web page at the corresponding EMBED or EMBEDSRC locations (step 944). This
process is continued until all the URLs in the URL container that have the EMBED or EMBEDSRC attribute
have been inserted into the conglomerate page. All the selected pages are collected and placed into the
new conglomerate page to complete the web page formatting process.
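
The section identification and insertion steps may be sketched as follows. This is a simplified Python illustration under stated assumptions, not the implementation of method 900: the function names extract_sections and embed_sections are hypothetical, the pages argument is a plain dictionary standing in for pages already retrieved via the URL container, and regular expressions are used for brevity where a full HTML parser would be more robust. The special visual formatting associated with EMBEDSRC is not modeled.

```python
import re

# Matches a section marker such as <A NAME="X"></A> or <A NAME="X.end"></A>.
NAME_TAG = re.compile(r'<a\s+name="([^"]+)"\s*>(?:\s*</a>)?', re.IGNORECASE)

# Matches an embedding link such as <A HREF="page.html#X" EMBED>text</A>.
EMBED_LINK = re.compile(
    r'<a\s+href="([^"#]+)#([^"]+)"\s+(?:embed|embedsrc)\s*>.*?</a>',
    re.IGNORECASE | re.DOTALL,
)


def extract_sections(html):
    """Map each section name to the text between its NAME tag and the next NAME tag.

    The next NAME tag is either the matching "X.end" marker or the start of a
    following section, which are the two cases described in the text.
    """
    tags = list(NAME_TAG.finditer(html))
    sections = {}
    for i, tag in enumerate(tags):
        name = tag.group(1)
        if name.endswith(".end"):
            continue  # end markers only close a section
        end = tags[i + 1].start() if i + 1 < len(tags) else len(html)
        sections[name] = html[tag.end():end].strip()
    return sections


def embed_sections(page_html, pages):
    """Replace each EMBED/EMBEDSRC link with the section it references."""

    def replace(match):
        url, section = match.group(1), match.group(2)
        sections = extract_sections(pages.get(url, ""))
        # Leave the link untouched when the referenced section is not found.
        return sections.get(section, match.group(0))

    return EMBED_LINK.sub(replace, page_html)
```

A page containing <A HREF="mysource.html#section" EMBED>See method</A> would thus have that link replaced by the text of the section named "section" in mysource.html, provided that page is present in the dictionary.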

It should be noted that the preferred embodiment of the web page formatting mechanism 128 uses HTML
tags to identify sections and insertion points. However, one skilled in the art will recognize that the same
operations may be performed using other languages and systems, for example Java and JavaScript.

The HTML attributes discussed above, in the preferred embodiment of the invention, are already contained
within the codes of the selected and related web pages. However, a method in accordance with the present
invention allows a user to insert the above-discussed attributes into an existing web page either manually or
dynamically to allow all existing web pages to be used in the formatting process of the present invention.
This method is preferably an interactive process where the HTML of existing pages is scanned for HREF
statements, and the user is given the opportunity to insert any of the newly-defined attributes above (or
other attributes or tags) as appropriate. A tool for performing this conversion of existing HTML allows a user
to quickly convert pages to a format in accordance with the present invention that allows formatting
mechanism 128 to process these pages as if they were originally developed using the attributes defined
herein.
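
A conversion tool of the kind described above might be sketched as follows. This Python illustration is hypothetical throughout: the function name annotate_links and the choose callback (which stands in for the interactive user prompt) are assumptions, and the sketch only handles anchors whose first attribute is a double-quoted HREF.

```python
import re

# Opening anchor tag whose first attribute is a double-quoted HREF.
ANCHOR_OPEN = re.compile(r'<a\s+href="([^"]*)"[^>]*>', re.IGNORECASE)

# Attribute names defined in the specification.
ALLOWED = {"EMBED", "EMBEDSRC", "FOLLOW", "SHOULDFOLLOW", "NOFOLLOW"}


def annotate_links(html, choose):
    """Insert an embedding attribute into each HREF statement selected by choose.

    choose(href) returns one of the ALLOWED names, or None to leave the
    link unchanged; it stands in for asking the user about each link.
    """

    def replace(match):
        attribute = choose(match.group(1))
        if attribute is None:
            return match.group(0)
        if attribute not in ALLOWED:
            raise ValueError(f"unknown embedding attribute: {attribute}")
        # Insert the attribute just before the closing ">" of the opening tag.
        return match.group(0)[:-1] + " " + attribute + ">"

    return ANCHOR_OPEN.sub(replace, html)
```

For example, a callback that returns "EMBEDSRC" for links ending in "#method" converts <A HREF="mysource.java#method">See method</A> into the EMBEDSRC form while leaving other links untouched.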

While the invention has been particularly shown and described with reference to preferred exemplary
embodiments thereof, it will be understood by those skilled in the art that various changes in form and
details may be made therein without departing from the spirit and scope of the invention. For example,
while the preferred embodiments herein are discussed in terms of HTML pages, other page formats and
data formats are equally encompassed by the present invention. The term page as used herein is intended
to encompass any quantum of data that may be processed or displayed. In addition, while the invention is
shown for exemplary purposes with regard to web clients and web servers that communicate over the
Internet, the present invention applies to any type of client/server scenario on any suitable network.
Furthermore, the use of URLs to collect and format a number of pages into a single web page is only one
method of collecting the desired data within the scope of the present invention. Also, the description herein
refers to a "user" that may perform certain functions. The term "user" as used in the specification and
claims herein expressly includes any agent that may perform the functions of a user, including without
limitation human users, computer functions, and software programs in any form.
Apparatus and method for formating web page

Claims of corresponding document: US6061700



What is claimed is: 

1. An apparatus comprising:

at least one processor;

a memory coupled to the at least one processor; and

a web page formatting mechanism residing in the memory and executed by the at least one processor, the
web page formatting mechanism identifying from at least one selected web page a plurality of links on the
at least one selected web page that each reference a web page, the formatting mechanism identifying from
the identified links at least one web page that is related to the at least one selected page, the formatting
mechanism generating a conglomerate web page from the at least one selected web page and the at least
one related web page.

2. The apparatus of claim 1 wherein the formatting mechanism comprises:
a mechanism for selecting at least one web page;

a mechanism for storing the at least one selected web page and at least one related web page; and
a mechanism that generates the conglomerate web page from the stored web pages.

3. The apparatus of claim 2 wherein the mechanism for selecting the at least one web page comprises a
mechanism that determines from a user the selected at least one web page and at least one relation criteria
for relating at least one of the related web pages to the at least one selected web pages.

4. The apparatus of claim 2 wherein the mechanism for storing the at least one selected web page and the
at least one related web page comprises a mechanism for determining from the at least one selected web
page the at least one related web page.

5. The apparatus of claim 2 wherein the mechanism that generates the conglomerate page comprises:
a mechanism that invokes the at least one selected web page and the plurality of related web pages
searching for at least one embedding code; and 

an embedding mechanism for embedding a portion of a referenced web page into a portion of the 
conglomerate web page according to the embedding code. 

6. The apparatus of claim 5 wherein the embedded portion comprises source code.
7. The apparatus of claim 5 wherein the embedded portion comprises an HTML section.

8. The apparatus of claim 1 wherein the selected web pages are selected using a Uniform Resource
Locator (URL).

9. The apparatus of claim 1 wherein the selected web page is a hypertext markup language (HTML) page.

10. The apparatus of claim 1 wherein two web pages are related if either of the two web pages are within a
predetermined nesting level with respect to the other web page.

11. The apparatus of claim 1 wherein two web pages are related if either of the two web pages have a link
to the other.

12. The apparatus of claim 1 wherein two web pages are related if the two pages reside on the same
server.

13. The apparatus of claim 1 wherein two web pages are related if the two web pages have the same base
address.

14. The apparatus of claim 1 further comprising a mechanism for printing the conglomerate web page.

15. An apparatus comprising:
at least one processor;
a memory coupled to the at least one processor;
a plurality of web pages residing in the memory; and

a web page formatting mechanism residing in the memory and executed by the at least one processor, the
web page formatting mechanism comprising:

a mechanism for selecting at least one web page from the plurality of web pages;

a mechanism for identifying from the at least one selected web page a plurality of links on the at least one
selected web page that each reference a web page;

a mechanism for identifying from the identified links at least one web page that is related to the at least one
selected page;

a mechanism for storing the at least one selected web page and the at least one related web page; and
a mechanism that generates a conglomerate web page from the stored web pages.

16. The apparatus of claim 15 wherein at least one of the plurality of web pages includes at least one 
embedding attribute. 

17. The apparatus of claim 16 wherein the at least one embedding attribute comprises at least one attribute
that at least partially determines whether or not a link to another of the plurality of web pages is followed.

18. The apparatus of claim 15 wherein at least one of the plurality of web pages includes at least one
attribute that defines the end of at least one section of the at least one web page. 



19. A program product comprising: 

(A) a web page formatting mechanism, the web page formatting mechanism identifying from at least one
selected web page a plurality of links on the at least one selected web page that each reference a web
page, the formatting mechanism identifying from the identified links at least one web page that is related to
the at least one selected page, the formatting mechanism generating a conglomerate web page from the at
least one selected page and the at least one related web page; and
(B) signal bearing media bearing the web page formatting mechanism.

20. The program product of claim 19 wherein the signal bearing media comprises recordable media.

21. The program product of claim 19 wherein the signal bearing media comprises transmission media.

22. The program product of claim 19 wherein two web pages are related if either of the two web pages
have a link to the other.

23. The program product of claim 19 wherein two web pages are related if the two web pages reside on the
same server. 

24. The program product of claim 19 wherein two web pages are related if the two web pages have the 
same base address. 

25. A program product comprising: 

(A) a web page formatting mechanism, the web page formatting mechanism including:
a mechanism for selecting at least one web page;

a mechanism for identifying from the at least one selected web page a plurality of links on the at least one
selected web page that each reference a web page;

a mechanism for identifying from the identified links at least one web page that is related to the at least one
selected page;

a mechanism for storing the at least one selected web page and the at least one related page; and
a mechanism that generates a conglomerate web page from the stored web pages;
(B) signal bearing media bearing the web page formatting mechanism.

26. The program product of claim 25 wherein the signal bearing media comprises recordable media.

27. The program product of claim 25 wherein the signal bearing media comprises transmission media. 

28. The program product of claim 25 wherein the web page formatting mechanism further comprises a 
mechanism for printing the conglomerate page. 

29. The program product of claim 25 wherein the selected web page is selected using

30. The program product of claim 25 wherein the selected web page is a hypertext markup language
(HTML) page.

31. A method for formatting a number of related web pages into a conglomerate web page, the method
including the steps of:

(A) selecting at least one web page and at least one relation criterion for the at least one web page;

(B) identifying from the at least one selected web page a plurality of links on the at least one selected web
page that each reference a web page;

(C) identifying from the identified links and the at least one relation criterion at least one web page that is
related to the at least one selected page;

(D) storing the at least one selected web page and the at least one related web page; and
(E) generating the conglomerate web page from the stored web pages.

32. A method for reformatting an existing web page, the method including the steps of: 
determining from the existing web page at least one reference in the existing web page to data to
incorporate into the existing web page;

modifying the existing web page by inserting at least one embedding code into the existing web page that
identifies the location of the data in the existing web page and that identifies the data to be incorporated into 
the existing web page. 

33. The method of claim 32 wherein the at least one reference in the existing web page comprises a URL.

34. The method of claim 32 wherein the data to be incorporated into the existing web page comprises
MIME data.

35. A method for formatting and printing a number of related web pages as a single document, the method
including the steps of:

selecting at least one URL corresponding to at least one selected web page;
selecting a relation criteria for each selected URL;

recursively collecting all related URLs for each selected URL according to the corresponding relation
criteria;

invoking the at least one selected web page and the related web pages corresponding to the related URLs;
storing the at least one selected web page and the related web pages in the single document, the at least
one selected web page and the related web pages including at least one embedding attribute specifying at
least one portion of at least one referenced web page to be embedded;

inserting the at least one portion of the at least one referenced web page specified by the at least one
embedding code into the single document at the corresponding web page; and
printing the single document.

36. The method of claim 35 wherein the at least one embedding attribute comprises at least one attribute
that at least partially determines whether or not a link to another of the plurality of pages is followed.

37. The method of claim 35 wherein the at least one embedding attribute defines the end of at least one
section of the at least one page.